You're out of free questions.

Upgrade now

Your company delivers breakfast via autonomous quadcopter drones. And something mysterious has happened.

Each breakfast delivery is assigned a unique ID, a positive integer. When one of the company's 100 drones takes off with a delivery, the delivery's ID is added to a list, delivery_id_confirmations. When the drone comes back and lands, the ID is again added to the same list.

After breakfast this morning there were only 99 drones on the tarmac. One of the drones never made it back from a delivery. We suspect a secret agent from Amazon placed an order and stole one of our patented drones. To track them down, we need to find their delivery ID.

Given the list of IDs, which contains many duplicate integers and one unique integer, find the unique integer.

The IDs are not guaranteed to be sorted or sequential. Orders aren't always fulfilled in the order they were received, and some deliveries get cancelled before takeoff.

Gotchas

We can do this in O(n)O(n) time.

No matter how many integers are in our input list, we can always find the unique ID in O(1)O(1) space!

Breakdown

A brute force approach would use two nested loops to go through every ID and check every other ID to see if there's a duplicate.

This would take O(n2)O(n^2) time and O(1)O(1) space. Can we bring that runtime down?

Well, we know every integer appears twice, except for one integer, which appears once. Can we just track how many times each integer appears?

We could iterate through the list and store each integer in a dictionary, where the keys are the integers and the values are the number of times we’ve seen that integer so far. At the end, we’d just need to return the integer we saw one time.

  def find_unique_delivery_id(delivery_ids):
    ids_to_occurrences = {}

    for delivery_id in delivery_ids:
        if delivery_id in ids_to_occurrences:
            ids_to_occurrences[delivery_id] += 1
        else:
            ids_to_occurrences[delivery_id]  = 1

    for delivery_id, occurrences in list(ids_to_occurrences.items()):
        if occurrences == 1:
            return delivery_id

Alright, we got our runtime down to O(n)O(n). That's probably the best runtime we can get—to find our unique integer we’ll definitely have to look at every integer in the worst case.

But now we've added O(n)O(n) space, for our dictionary. Can we bring that down?

Well, we could use booleans

Boolean is a data type that has only two possible values: true and false.

Because an individual bit also has just two possible values (0 and 1), it's tempting to think that a boolean is stored in just 1 bit. But in practice, booleans often take up a whole byte (8 bits). This is because a byte is usually the basic addressable unit in a computer. In other words, usually the computer doesn't grab individual bits in memory—instead it grabs "chunks" of 8 bits or more at a time.

Sometimes booleans take up as much space as an integer, which is often 32 or 64 bits in modern computers. In fact, in some languages, like Python, booleans are subclasses of integers.

as our values, instead of integers. If we see an integer, we'll add it as a key in our dictionary with a boolean value of True. If we see it again, we'll change its value to False. At the end, our non-repeated order ID will be the one integer with a value of True.

How much space does this save us? Depends how our language stores booleans vs integers. Often booleans take up just as much space as integers. In fact, in Python, booleans are a subtype of integers.

And even if each boolean were just 1 bit, that'd still be O(n)O(n) space overall.

So using booleans probably doesn't save us much space here. Any other ideas?

Let’s zoom out and think about what we’re working with. The only data we have is integers. How are integers stored?

Our machine stores integers as binary numbers using bits.

At a deep level, your computer works with bits—1s and 0s. But these get bundled up into more human readable things, like characters and lists. This is called abstraction.

While abstractions are nice, sometimes we want to work directly with bits. You can do this with bit manipulation, which involves bitwise operations.

Bitwise operations include AND, OR, XOR, NOT, and Bit Shifts.

AND, for example, takes two bits and returns 1 if both bits are 1. Otherwise, it returns 0. OR returns 1 if either of the bits are 1.

  # AND
1 & 1  # gives 1
0 & 1  # gives 0
0o010 & 0o111  # gives 0010

# OR
0 | 1  # gives 1
0 | 0  # gives 0
1001 | 0o100  # gives 1101
What if we thought about this problem on the level of individual bits?

Let's think about the bitwise operations AND,

The AND operation takes two bits and returns 11 if both bits are 11. Otherwise, it returns 00.

  1 & 1  →  1
1 & 0  →  0
0 & 1  →  0
0 & 0  →  0

Think of it like a hose with two knobs. Both knobs must be set to on for water to come out.

When performing AND on two integers, the AND operation is calculated on each pair of bits (the two bits at the same index in each number).

  5 & 6  # gives 4

# At the bit level:
#     0101  (5)
#   & 0110  (6)
#   = 0100  (4)
OR,

The OR operation takes two bits and returns 11 if either of the bits are 11. Otherwise, it returns 00.

  1 | 1  →  1
1 | 0  →  1
0 | 1  →  1
0 | 0  →  0

Think of it like a bucket with two holes in it. If both holes are closed, no water comes out. If either hole is open, or if both are open, water comes out.

When performing OR on two integers, the OR operation is calculated on each pair of bits (the two bits at the same index in each number).

  5 | 6  # gives 7

# At the bit level:
#     0101  (5)
#   | 0110  (6)
#   = 0111  (7)
XOR,

The XOR operation (or exclusive or) takes two bits and returns 11 if exactly one of the bits is 11. Otherwise, it returns 00.

  1 ^ 1  →  0
1 ^ 0  →  1
0 ^ 1  →  1
0 ^ 0  →  0

Think of it like a bag of chips where only one hand can fit in at a time. If no one reaches for chips, no one gets chips, and if both people reach for chips, they can't fit and no one gets chips either!

When performing XOR on two integers, the XOR operation is calculated on each pair of bits (the two bits at the same index in each number).

  5 ^ 6  # gives 3

# At the bit level:
#     0101  (5)
#   ^ 0110  (6)
#   = 0011  (3)
NOT

The NOT bitwise operation inverts bits. A 00 becomes a 11. A 11 becomes a 00.

The NOT operator is often written as a tilde character ("~"):

  ~ 0000 0101
= 1111 1010

When numbers are printed in base-10, the result of a NOT operation can be surprising. In particular, positive numbers can become negative and negative numbers can become positive. For example:

  ~ 5  # gives -6

# At the bit level:
#   ~ 0000 0101  (5)
#   = 1111 1010  (-6)

This is because numbers are (usually) represented using two's complement, where the leftmost bit is actually negative. So flipping the leftmost bit usually flips the sign of the number.

and bit shifts.

A bit shift moves each digit in a number's binary representation left or right. There are three main types of shifts:

Left Shifts

When shifting left, the most-significant bit is lost, and a 00 bit is inserted on the other end.

The left shift operator is usually written as "<<".

  0010 << 1  →  0100
0010 << 2  →  1000

A single left shift multiplies a binary number by 2:

  0010 << 1  →  0100

0010 is 2
0100 is 4

Logical Right Shifts

When shifting right with a logical right shift, the least-significant bit is lost and a 00 is inserted on the other end.

  1011 >> 1  →  0101
1011 >> 3  →  0001

For positive numbers, a single logical right shift divides a number by 2, throwing out any remainders.

  0101 >> 1  →  0010

0101 is 5
0010 is 2

Arithmetic Right Shifts

When shifting right with an arithmetic right shift, the least-significant bit is lost and the most-significant bit is copied.

Languages handle arithmetic and logical right shifting in different ways. Python's right shift operator (>>) always does arithmetic right shifting.

  1011 >> 1  →  1101
1011 >> 3  →  1111

0011 >> 1  →  0001
0011 >> 2  →  0000

The first two numbers had a 11 as the most significant bit, so more 11's were inserted during the shift. The last two numbers had a 00 as the most significant bit, so the shift inserted more 00's.

If a number is encoded using two's complement, then an arithmetic right shift preserves the number's sign, while a logical right shift makes the number positive.

  # Arithmetic shift
1011 >> 1  →  1101
    1011 is -5
    1101 is -3

# Logical shift
# (Not a thing in Python, but if it were:)
1111 >> 1  →  0111
    1111 is -1
    0111 is  7

Is one of those operations helpful for finding our unique integer?

We’re seeing every integer twice, except one. Is there a bitwise operation that would let the second occurrence of an integer cancel out the first?

If so, we could start with a variable unique_delivery_id set to 00 and run some bitwise operation with that variable and each number in our list. If duplicate integers cancel each other out, then we’d only be left with the unique integer at the end!

Which bitwise operation would let us do that?

Solution

We XOR

The XOR operation (or exclusive or) takes two bits and returns 11 if exactly one of the bits is 11. Otherwise, it returns 00.

  1 ^ 1  →  0
1 ^ 0  →  1
0 ^ 1  →  1
0 ^ 0  →  0

Think of it like a bag of chips where only one hand can fit in at a time. If no one reaches for chips, no one gets chips, and if both people reach for chips, they can't fit and no one gets chips either!

When performing XOR on two integers, the XOR operation is calculated on each pair of bits (the two bits at the same index in each number).

  5 ^ 6  # gives 3

# At the bit level:
#     0101  (5)
#   ^ 0110  (6)
#   = 0011  (3)
all the integers in the list. We start with a variable unique_delivery_id set to 00. Every time we XOR with a new ID, it will change the bits. When we XOR with the same ID again, it will cancel out the earlier change.

In the end, we'll be left with the ID that appeared once!

  def find_unique_delivery_id(delivery_ids):
    unique_delivery_id = 0

    for delivery_id in delivery_ids:
        unique_delivery_id ^= delivery_id

    return unique_delivery_id

Complexity

O(n)O(n) time, and O(1)O(1) space.

What We Learned

This problem is a useful reminder of the power we can unlock by knowing what's happening at the "bit level."

How do you know when bit manipulation might be the key to solving a problem? Here are some signs to watch out for:

  1. You want to multiply or divide by 2 (use a left shift to multiply by 2, right shift to divide by 2).
  2. You want to "cancel out" matching numbers (use XOR).

Do you have an answer?

Wanna review this one again later? Or do you feel like you got it all?

Mark as done Pin for review later
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
import unittest
def find_unique_delivery_id(delivery_ids):
# Find the one unique ID in the list
return 0
# Tests
class Test(unittest.TestCase):
def test_one_drone(self):
actual = find_unique_delivery_id([1])
expected = 1
self.assertEqual(actual, expected)
def test_unique_id_comes_first(self):
actual = find_unique_delivery_id([1, 2, 2])
expected = 1
self.assertEqual(actual, expected)
def test_unique_id_comes_last(self):
actual = find_unique_delivery_id([3, 3, 2, 2, 1])
expected = 1
self.assertEqual(actual, expected)
def test_unique_id_in_middle(self):
actual = find_unique_delivery_id([3, 2, 1, 2, 3])
expected = 1
self.assertEqual(actual, expected)
def test_many_drones(self):
actual = find_unique_delivery_id([2, 5, 4, 8, 6, 3, 1, 4, 2, 3, 6, 5, 1])
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Reset editor

Powered by qualified.io

. . .